Abstract. A well-known result by Frick and Grohe shows that deciding
FO logic on trees involves a parameter dependence that is a tower of
exponentials. Though this lower bound is tight for Courcelle's theorem, 
it has been evaded by a series of recent meta-theorems for other graph 
classes. Here we provide some additional non-elementary lower bound 
results, which are in some senses stronger. Our goal is to explain com- 
mon traits in these recent meta-theorems and identify barriers to further 
progress. 

More specifically, first, we show that on the class of threshold graphs, 
and therefore also on any union and complement-closed class, there is no 
model-checking algorithm with elementary parameter dependence even 
for FO logic. Second, we show that there is no model-checking algorithm 
with elementary parameter dependence for MSO logic even restricted 
to paths (or equivalently to unary strings), unless EXP=NEXP. As a 
corollary, we resolve an open problem on the complexity of MSO model- 
checking on graphs of bounded max-leaf number. Finally, we look at 
MSO on the class of colored trees of depth d. We show that, assuming 
the ETH, for every fixed d > 1 at least d+1 levels of exponentiation are 
necessary for this problem, thus showing that the (d+1)-fold exponential
algorithm recently given by Gajarsky and Hlineny is essentially optimal. 

1 Introduction 

Algorithmic meta-theorems are general statements establishing tractabil- 
ity for a whole class of problems (often defined by expressibility in a 
certain logic) in some class of inputs (usually a family of graphs). By far 
the most famous and celebrated theorem in this area is a twenty-year 
old result due to Courcelle [2] which states that all problems expressible 
in monadic second-order logic (MSO2) are linear-time solvable on graphs 
of bounded treewidth. Thus, in one broad sweep this theorem establishes 
that a large number of natural well-known problems, such as 3-COLORING 
and Hamiltonicity, are tractable on this important graph family. Much 
work has been devoted in recent years to proving stronger and stronger 
meta-theorems in this spirit, often extending Courcelle's theorem to other 
graph classes (see e.g. [3,7,5] or [12,13] for some great surveys).

The most often cited drawback of Courcelle's theorem has to do with
the "hidden constant" in the algorithm's linear running time. It is clear
that the running time must somehow depend on the input formula and 
the graph's treewidth, but the dependence given in Courcelle's theorem is 
in the worst case a tower of exponentials whose height grows with the size 
of the formula. Unfortunately, this cannot be avoided: Frick and Grohe [8]
proved that the parameter dependence has to be non-elementary even if 
one restricts the problem severely by just looking at properties expressible 
in first-order logic on trees. 

This lower bound result, though quite devastating, has proven very 
fruitful and influential: several papers have appeared recently with the 
explicit aim of proving meta-theorems which evade it, and thus achieve a 
much better dependence on the parameters. Specifically, in [15] algorithmic
meta-theorems with an elementary parameter dependence are shown
for vertex cover, max-leaf number and the newly defined neighborhood
diversity. A meta-theorem for twin cover was shown by Ganian [10]. In
addition, meta-theorems were shown for tree-depth by Gajarsky and Hlineny
[9] and for the newly defined shrub-depth (which generalizes neighborhood
diversity and twin cover) by Ganian et al. [11].

Thus, together with improved meta-theorems, these papers give a new 
crop of graph complexity measures, some more general than others. It 
becomes a natural question how much progress we can hope to achieve 
this way, that is, how far this process of defining more and more general 
"graph widths" can go on before hitting some other natural barrier that 
precludes an elementary parameter dependence. Is simply avoiding the 
class of all trees enough? 

This is exactly the question we try to answer in this paper. Towards 
this end we try to give hardness results for graph families which are as sim- 
ple as possible. Perhaps most striking among them is a result showing that 
not only is avoiding all trees not enough but in fact it is necessary to avoid 
the much smaller class of uncolored paths if one hopes for an elementary 
parameter dependence. As an example application, this almost immedi- 
ately rules out the existence of meta-theorems with elementary parameter 
dependence in any induced-subgraph-closed graph class with unbounded 
diameter and any edge-subdivision-closed graph class. This explains why 
all recently shown meta-theorems we mentioned work on classes which are 
closed under induced subgraphs but have bounded diameter and are not 
closed under edge subdivisions. 

Our results can be summarized as follows. First, a non-elementary 
lower bound for model checking FO logic on threshold graphs is shown. In 
a sense, this is a natural analogue of the lower bound for trees to the realm
of clique-width, since threshold graphs are known to have the smallest 
possible clique-width. The proof is relatively simple and consists mostly 
of translating a similar lower bound given in [8] for FO model checking on
binary words. However, the main interest of this result is that as a corollary 
we show that the complexity of FO model checking is non-elementary for 
any graph class closed under disjoint union and complement. This explains 
why, though some of the recent meta-theorems work on complement-closed 
graph classes (e.g. neighborhood diversity, shrub-depth) and some work 
on union-closed graph classes (e.g. tree-depth), no such meta-theorem has 
been shown for a class that has both properties. 

Our second result is that model checking MSO logic on uncolored paths 
(or equivalently on unary strings) has a non-elementary parameter depen- 
dence. This is the most technically demanding of the results of this paper, 
and it is proved under the assumption EXP≠NEXP. The proof consists
of simulating the workings of a non-deterministic Turing machine via an 
MSO question on a path. Though the idea of simulating Turing machines 
has appeared before in similar contexts [14], because the graphs we have
here are very restricted we face a number of significant new challenges. 
The main tool we use to overcome them, which may be of independent in- 
terest, is an MSO formula construction that compares the sizes of ordered 
sets while using an extremely small number of quantifiers. In the end, this 
result strengthens both non-elementary MSO lower bounds given in [8] 
(for trees and for binary strings), modulo a slightly stronger complexity 
assumption. It also resolves the complexity of MSO model checking for 
max-leaf number, which was left open in [15]. As an added corollary, we 
give an alternative, self-contained proof of a result from [3], stating that
MSO2 model checking is not in XP for cliques unless EXP=NEXP. 

Finally, we study one of the recent positive results in this area by con- 
sidering the problem of model-checking MSO logic on rooted colored trees 
of height d. This is an especially interesting problem, since the (d+1)-fold
exponential algorithm of [9] is the main tool used in the meta-theorems of
both [9] and [11]. We show that, assuming the ETH, any algorithm needs
at least d + 1 levels of exponentiation, and therefore the algorithm of [9]
is essentially optimal. The main idea of the proof is to "prune" the trees 
constructed in the proof from [8] and then use an appropriate number of 
labels to differentiate their leaves. 

2 Preliminaries

The basic problem we are concerned with is model-checking: We are given 
a formula $\phi$ (in some logic) and a structure S (usually a graph or a string)
and must decide if $S \models \phi$, that is, if S satisfies the property described by
$\phi$.

Due to space constraints, we do not give a full definition of FO and 
MSO logic here (see e.g. [3]). Let us just briefly describe some conventions.
We use lower-case letters to denote singleton (FO) variables, and
capitals to denote set variables. When the input is a graph, we assume 
the existence of an $E(x,y)$ predicate encoding edges; when it's a string
a $\prec$ predicate encodes a total ordering; when it's a rooted tree a $C(x,y)$
predicate encodes that x is a child of y. Sometimes the input also has a set
of colors (also called labels). For each color c we are given a unary predicate
$P_c(x)$. When the input is an uncolored graph that consists of a single
path it is possible to simulate the $\prec$ predicate by picking one endpoint of
the path arbitrarily (call it s) and saying that $x \prec y$ if all paths from s
to y contain x. Thus, model-checking MSO logic on uncolored paths is at
least as hard as it is on unary strings. In most of the paper when we talk
about MSO logic for graphs we mean MSO1, that is, with quantification
over vertex sets only. An exception is Corollary 4, which talks about MSO2
logic, which allows edge set quantifiers.

A graph is a threshold graph ([1]) if it can be constructed from $K_1$ by
repeatedly adding union vertices (not connected to any previous vertex)
and join vertices (connected to all previous vertices), one at a time. Thus,
a threshold graph can be described by a string over the alphabet {u, j}. A
graph is a cograph if it is $K_1$, or it is a disjoint union of cographs, or it is
the complement of a cograph. It is not hard to see that threshold graphs
are cographs. From the definition it follows that any class of graphs that
contains $K_1$ and is closed under disjoint union and complement contains
all cographs; if it is closed under the union and join operations it contains 
all threshold graphs. 
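
As an illustration of the string description of threshold graphs, the following sketch builds the adjacency sets of the graph described by a string over {u, j}. The function names `threshold_graph` and `is_connected` are hypothetical helpers introduced only for this example, and the convention that the first character stands for the initial $K_1$ is an assumption of the sketch.

```python
def threshold_graph(description: str) -> list[set[int]]:
    """Adjacency sets of the threshold graph described by a string over {u, j}.

    Vertex i corresponds to the i-th character. The first character stands
    for the initial K_1, so it does not matter whether it is 'u' or 'j'.
    """
    if not description or set(description) - {"u", "j"}:
        raise ValueError("description must be a non-empty string over {u, j}")
    adj: list[set[int]] = [set() for _ in description]
    for i, kind in enumerate(description):
        if kind == "j":  # join vertex: connected to all previous vertices
            for earlier in range(i):
                adj[i].add(earlier)
                adj[earlier].add(i)
        # union vertex: not connected to any previous vertex, nothing to do
    return adj


def is_connected(adj: list[set[int]]) -> bool:
    """Depth-first search from vertex 0."""
    seen, stack = {0}, [0]
    while stack:
        v = stack.pop()
        for w in adj[v] - seen:
            seen.add(w)
            stack.append(w)
    return len(seen) == len(adj)
```

In this sketch a description ending in j always yields a connected graph, since the last vertex is adjacent to every other vertex.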

All logarithms are base two. We define $\exp^{(k)}(n)$ as follows: $\exp^{(0)}(n) = n$
and $\exp^{(k+1)}(n) = 2^{\exp^{(k)}(n)}$. Then $\log^{(k)} n$ is the inverse of $\exp^{(k)}(n)$.
Finally, $\log^* n$ is the minimum i such that $\log^{(i)} n \le 1$.
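
The definitions above can be sketched in a few lines of code. This is only an illustration: the names `exp_tower` and `log_star` are hypothetical, and the sketch assumes the threshold in the definition of the iterated logarithm is read as "at most 1".

```python
import math


def exp_tower(k: int, n: float) -> float:
    """exp^(k)(n): apply x -> 2**x to n exactly k times."""
    value = n
    for _ in range(k):
        value = 2 ** value
    return value


def log_star(n: float) -> int:
    """Smallest i such that applying log2 to n i times gives a value <= 1."""
    i = 0
    value = n
    while value > 1:
        value = math.log2(value)
        i += 1
    return i
```

For instance, `exp_tower(3, 1)` evaluates the tower 2^(2^(2^1)), and `log_star` undoes one level of such a tower per iteration.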

3 Threshold Graphs 

As mentioned, Frick and Grohe [8] showed that there is no FPT model-checking
algorithm for FO logic on trees with an elementary dependence
on the formula size, under standard complexity assumptions. In many
senses this is a great lower bound result, because it matches the tower 
of exponentials that appears in the running time of Courcelle's theorem, 
while looking both at a much simpler logic (FO rather than MSO2) and 
at the class of graphs with the smallest possible treewidth, namely trees. 

Courcelle, Makowsky and Rotics [3] have given an extension of Courcelle's
theorem to MSO1 logic for clique-width. The parameter dependence
is again a tower of exponentials and, since trees have clique-width at most
3, we already know that this cannot be avoided even for graphs of
constant clique-width. Here we will slightly strengthen this result, showing
that the non-elementary dependence cannot be avoided even on cographs, 
the class of graphs that has the smallest possible clique-width (that is, 
clique-width 2) without being trivial. We will heavily rely on a lower 
bound, due again to Frick and Grohe, on the complexity of model check- 
ing on binary strings. 

One interesting consequence of the lower bound we give for cographs 
is that it precludes the existence of an FPT algorithm with elementary 
parameter dependence for any graph class that satisfies two simple prop- 
erties: closure under disjoint unions and closure under complement. The 
reason for this is that if a class is closed under both of these operations 
and it contains the single-vertex graph, then it must contain all cographs 
(we will also show that the assumption that $K_1$ is in the class is not
needed). This observation helps to explain why, though some of the re- 
cent elementary model-checking algorithms which have appeared work on 
union-closed graph classes, and some work on complement-closed graph 
classes, no such algorithms are known for classes with both properties. 

The proof we present here is relatively simple and it relies on the 
following theorem. 

Theorem 1 ([8]). Unless FPT=AW[*], for any constant c and any elementary function
f there is no model-checking algorithm for FO logic on binary words which,
given a formula $\phi$ and a word w, decides if $w \models \phi$ in time at most $f(\phi)|w|^c$.

We will reduce this problem to FO model checking on threshold graphs. 
This is quite natural, since the definition of threshold graphs gives a 
straightforward correspondence between graphs and strings. 

Theorem 2. Unless FPT=AW[*], for any constant c and any elementary
function f there is no model-checking algorithm for FO logic on connected
threshold graphs which, given a formula $\phi$ and such a graph G, decides if
$G \models \phi$ in time at most $f(\phi)|G|^c$.

Proof (sketch). We can encode an arbitrary binary string with a threshold graph by
encoding each 0 by two vertices uj and each 1 by three vertices ujj. We
also need to translate an FO formula for strings into one for graphs. Existential
quantification in the string is simulated by existential quantification in
the graph where we ask that the vertex selected is a union vertex (this
can be expressed in FO logic). The predicates $x \prec y$ and $P_1(x)$ can be
implemented by checking if there are join vertices connected to x but not
y in the first case, and checking for two join vertices connected to x but
not later union vertices in the second. □

Corollary 1. Let C be a graph class that is closed under disjoint union
and complement, or under disjoint union and join. Unless FPT=AW[*],
for any constant c and any elementary function f there is no model-checking
algorithm for FO logic on C which, given a formula $\phi$ and a graph
$G \in C$, decides if $G \models \phi$ in time at most $f(\phi)|G|^c$.

4 Paths, Unary Strings 

The main result of this section is a reduction proving that, under the 
assumption that EXP≠NEXP, there is no FPT model-checking algorithm
for MSO logic with an elementary parameter dependence on the formula 
even on graphs that consist of a single path, or equivalently, on unary 
strings. As a consequence, this settles the complexity of MSO model- 
checking on graphs with bounded max-leaf number, a problem left open 
in [15], since paths have the smallest possible max-leaf number. Until now
a similar result was known only for the much richer class of binary strings 
(or equivalently colored paths), under the weaker assumption that P≠NP
[8]. It is somewhat surprising that we are able to extend this result to
uncolored paths, because in this case the size of the input is exponentially 
blown-up compared to a reasonable encoding. One would expect this to 
make the problem easier, but in fact, it only makes it more complicated 
to establish hardness. 

Indeed, one of the main hurdles in proving a lower bound for MSO 
on unary strings, or paths, is information-theoretic. Normally, one would 
start with an NP-hard problem, and reduce to a model-checking instance 
with a very small formula $\phi$. But, because the path we construct can
naturally be stored with a number of bits that is logarithmic in its size, 
in order to encode n bits of information from the original instance into 
the new instance we need to construct a path of exponential size. Thus, 
a polynomial-time reduction seems unlikely and this is the reason we end
up using the assumption that EXP≠NEXP, instead of P≠NP.

Our approach is to start from the prototypical NEXP-complete problem:
given n bits of input for a non-deterministic Turing machine that
runs in time $2^{n^k}$, does the machine accept? We will use the input path to
simulate the machine's tape and then ask for a subset of the vertices of 
this path that corresponds to cells in the tape where 1 is written. Thus, 
what we need at this point is an MSO formula that checks if the chosen 
vertices encode a correct accepting computation. 

Of course, to describe a machine's computation in MSO logic a sig- 
nificant amount of machinery will be needed. We note that, though the 
approach of simulating a Turing machine with an MSO formula has been 
used before (e.g. [14]), the problem here is significantly more challenging
for two reasons: first, unlike previous cases the input here is uncolored, so 
it is harder to encode arbitrary bits; and second, there are (obviously) no 
grid-like minors in our graph, so it's harder to encode the evolution of a 
machine's tape, and in particular to identify vertices that correspond to 
the same tape cell in different points in time. 

Our main building block to overcome these problems is an MSO con- 
struction which compares the sizes of paths (or generally, ordered sets) of 
size n with very few (roughly $2^{O(\log^* n)}$) quantifiers. This construction may
be of independent interest in the context of the counting power of MSO 
logic. We first describe how to build this formula, then use it to obtain 
other basic arithmetic operations (such as exponentiation and division) 
and finally explain how they all fit together to give the promised result. 

4.1 Measuring Long Paths with Few Quantifiers 

To keep the presentation simple we will concentrate on the model-checking 
problem on unary strings; formulas for MSO on paths can easily be constructed
as explained in Section 2. We therefore assume that there is a
predicate $\prec$ which gives a total ordering of all elements.

Let us now develop our basic tool, which will be an MSO formula
$eq_L(P_1, P_2)$, where $P_1, P_2$ are free set variables. The desired behavior of
the formula is that if $|P_1| = |P_2|$ and $|P_1| \le L$ then the formula will be
true, while on the other hand whenever the formula is true it must be the
case that $|P_1| = |P_2|$. In other words, the formula will always correctly
identify equal sets with size up to L, and it will never identify two unequal
sets as equal (it may however be false for two equal sets larger than L).
Our main objective is to achieve this with as few quantifiers as possible.

We will work inductively. It should be clear that for very small values
of L (say L = 4) it is possible to compare sets of elements with size at
most L with a constant number of set and vertex quantifiers and we can
simply make the formula false if one set has more than 4 elements. So,
suppose that we have a way to construct the desired formula for some L.
We will show how to use it to make the formula $eq_{L'}$, where $L' \ge L \cdot 2^L$.
If our recursive definition of $eq_{L'}$ uses a constant number of copies of $eq_L$
then in the end we will have $|eq_L| = 2^{O(\log^* L)}$, because for each level of
exponentiation we blow up the size of a formula by a constant factor.
This will be sufficiently small to rule out an elementary parameter
dependence.
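
To make the growth rate concrete, the following sketch counts how many times the step from L to $L \cdot 2^L$ has to be applied, starting from the base case L = 4, before sets of size n are covered, and derives a size bound under the assumption that each step multiplies the formula size by a constant. The names `levels_needed` and `formula_size_bound`, as well as the default values `base_size=10` and `copies=3`, are hypothetical placeholders and are not taken from the construction itself.

```python
def levels_needed(n: int, base: int = 4) -> int:
    """Number of steps L -> L * 2**L, starting from L = base, until L >= n.

    Only meant for small demonstrations: L grows as a tower of exponentials,
    so computing even the fourth value explicitly is infeasible.
    """
    L = base
    levels = 0
    while L < n:
        L = L * 2 ** L
        levels += 1
    return levels


def formula_size_bound(n: int, base_size: int = 10, copies: int = 3) -> int:
    """Size bound if each level uses `copies` copies of the previous formula.

    base_size and copies are illustrative constants, not values from the paper.
    """
    return base_size * copies ** levels_needed(n)
```

Two steps already cover every n up to $64 \cdot 2^{64}$, which is why the number of levels, and hence the exponent of the size bound, behaves like an iterated logarithm of n.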

Let us now give a high-level description of the idea, by concentrating
first on the set $P_1$. We will select a subset of $P_1$, call it $Q_1$, and this
naturally divides $P_1$ into sections, which are defined as maximal sets of
vertices of $P_1$, consecutive in the ordering, with the property that either all
or none of their vertices belong in $Q_1$. We will make sure that all sections
have length L, except perhaps the last, which we call the remainder (see
Figure 1). It is not hard to see that this structure can be imposed if the
predicate $eq_L$ is available. We do the same for $P_2$ and now we need to
verify that the two remainders have the same length (easy with $eq_L$) and
that we have the same number of sections on $P_1$ and $P_2$.

Fig. 1. An example of the counting structure imposed on a set for L = 4.
Assume that the elements of the set are displayed here in the ordering
given by $\prec$. We select a set of elements (indicated by boxes) to create
sections of elements of size L, and a remainder set (indicated by a dashed
box). We then select some elements from each section appropriately (indicated
by solid grey filling) so that we simulate counting in binary. For
L = 4 this method can count up to length $L 2^L = 64$ before overflowing.

Now we could naively try to count the number of sections by selecting 
a representative from each and forming a set. This would not work since 
the number of sections can be as large as $2^L$ and the inductive hypothesis only
allows us to use $eq_L$ to compare sets of size L. Thus, we have to work a
little harder. 

We need to count the number of sections this structure creates on $P_1$
(which may be up to $2^L$). We select another subset of $P_1$, call it $B_1$. The
intuition here is that selecting $B_1$ corresponds to writing a binary number
on each section, by interpreting positions selected in $B_1$ as 1 and the rest
as 0. We will now need to make sure that each section encodes the binary
number that is one larger than the number encoded by the immediately
preceding section. This is achievable by using $eq_L$ to locate the elements
that represent the same bit positions. We also make sure that there was
no overflow in the counting and that counting started from zero, that is,
all sections have some vertex not in $B_1$ and the first has no vertices in $B_1$.
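
The counting structure can be simulated directly on positions 0, ..., m-1 of an ordered set. The sketch below is only an illustration under stated assumptions: the names `impose_structure`, `read_structure` and `same_size` are hypothetical, bits are written least significant first, and the exact overflow boundary (counter values 0 up to $2^L - 2$) is a convention chosen here so that every section has a position outside B. Python sets stand in for the set variables Q and B; the sketch checks the same conditions the formula would check but does not model quantifier counts.

```python
def impose_structure(m: int, L: int):
    """Choose Q and B for an ordered set with positions 0 .. m-1.

    Returns (Q, B), or None if the binary counter would overflow.
    """
    sections = m // L
    if sections > 2 ** L - 1:       # counter values 0 .. 2^L - 2 only
        return None
    # Consecutive blocks of L positions alternate between "not in Q" and "in Q".
    Q = {p for p in range(m) if (p // L) % 2 == 1}
    # Section s carries the number s in binary, least significant bit first.
    B = {s * L + b for s in range(sections) for b in range(L) if (s >> b) & 1}
    return Q, B


def read_structure(m: int, L: int, Q: set, B: set):
    """Check the conditions; return (sections, remainder length) or None."""
    runs, start = [], 0
    for p in range(1, m + 1):       # maximal runs with constant Q-membership
        if p == m or (p in Q) != ((p - 1) in Q):
            runs.append((start, p))
            start = p
    remainder = 0
    if runs and runs[-1][1] - runs[-1][0] < L:
        a, b = runs.pop()           # a shorter last run is the remainder
        remainder = b - a
    previous = -1
    for a, b in runs:
        if b - a != L:
            return None             # every section must have length exactly L
        value = sum(1 << i for i in range(L) if a + i in B)
        if value != previous + 1 or value == 2 ** L - 1:
            return None             # must count 0, 1, 2, ... without overflow
        previous = value
    return len(runs), remainder


def same_size(m1: int, m2: int, L: int) -> bool:
    """Compare two set sizes through their counting structures."""
    s1, s2 = impose_structure(m1, L), impose_structure(m2, L)
    if s1 is None or s2 is None:
        return False                # too large for this L: answer "false"
    return read_structure(m1, L, *s1) == read_structure(m2, L, *s2)
```

Comparing the pair (number of sections, remainder length) plays the role of comparing the last sections and the remainders in the text.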

Finally, assuming that the above counting structure is correctly imposed
on both $P_1$ and $P_2$ all that is left is to take the last sections on both
and compare them. If the same binary number is encoded on both then
$|P_1| = |P_2|$.

A formal definition for $eq_L$ and all other formulas of this section is
given in the appendix. 

Lemma 1. Let L > 2 be a power of two. Then we can define a formula
$eq_L(P_1, P_2)$ such that the formula is true if and only if $|P_1| = |P_2| \le L \log L$.
Furthermore, $|eq_L| = 2^{O(\log^* L)}$.

Before we go on, we will also need formulas to perform some slightly
more complicated arithmetic operations than simply counting. In particular,
we will need a formula $exp_L(P_1, P_2)$, which will be true if $|P_2| = 2^{|P_1|}$.
The trick we use for this is shown in Figure 2. The idea is that we select
a subset of $P_2$, call it Q, which marks out a set of $|P_1| + 1$ elements whose
consecutive distances form a geometric progression with ratio 2.

Fig. 2. An example where the set on the left has size 4 and we verify that
the set on the right has size $2^4$. First we select a subset of elements on
the right of size one more than the set on the left. Then we ensure that 
distances between consecutive elements are doubled at each step and the 
first and last element are selected. 
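
The doubling trick can be checked numerically. In the sketch below, which is an illustration rather than the formula itself, positions are 0-based and the first gap is assumed to be 1; the names `doubling_markers` and `check_exp` are hypothetical.

```python
def doubling_markers(k: int) -> list[int]:
    """Positions of k + 1 marked elements whose gaps are 1, 2, 4, ..."""
    positions = [0]                 # the first element is always selected
    gap = 1
    for _ in range(k):
        positions.append(positions[-1] + gap)
        gap *= 2
    return positions


def check_exp(size_p1: int, size_p2: int) -> bool:
    """True iff the last marker falls on the last element of the second set."""
    markers = doubling_markers(size_p1)
    return markers[-1] == size_p2 - 1
```

Since the gaps sum to $2^k - 1$, the last marker coincides with the last element exactly when the second set has $2^k$ elements.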

Finally, we will need the following MSO formulas: $root^{(k)}(P_1, P_2)$, which
checks a k-th root relation between $|P_1|$ and $|P_2|$, assuming k is a power of two;
$div(P_1, P_2)$, which checks if $|P_1|$ divides $|P_2|$; and $mod_L(P_1, P_2, R)$, which
is true if $|P_2| \bmod |P_1| = |R|$ and $|P_1| \le L$.



4.2 Hardness for Unary Strings and Paths 

Theorem 3. Let f be an elementary function and c a constant. If there
exists an algorithm which, given a unary string w of length n and an MSO
formula $\phi$, decides if $w \models \phi$ in time $f(|\phi|) n^c$ then EXP=NEXP.

Proof (sketch). Suppose we are given a non-deterministic Turing machine running in
time $T = 2^{n^k}$ for k a power of two, and therefore using at most T cells
of its tape, when given n bits of input. We are also given the n bits of
input, which we interpret as a binary number $I < 2^n$. We must decide if
the machine accepts using the hypothetical model-checking algorithm for
unary words.

We construct a path of size $T^2(2I+1)$. Using div we locate a sub-path
of size I (finding the largest odd divisor of the path) and a sub-path of
size $T^2$, which we divide into sections of size T using root. These sections
will represent snapshots of the tape during the machine's execution. We
ask for a subset of the elements of the path that encodes the tape cells on
which the machine writes 1. Now we need to check two things: first, that
the bits at the beginning of the tape correspond to the input. This can be
done with exp and some arithmetic on the size of I. Second, that the bits
selected encode a correct computation. This is done by checking all pairs
of elements from consecutive snapshots that correspond to the same cell.
These are identified using $eq_L$ (their distance is exactly T). □
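
The arithmetic behind the path length can be illustrated outside of MSO. The following sketch shows, with ordinary integer arithmetic, why a path of size $T^2(2I+1)$ determines both T and I when T is a power of two. The names `path_length`, `decode` and `snapshot_and_cell` are hypothetical and only stand in for what the formulas div and root express.

```python
import math


def path_length(T: int, I: int) -> int:
    """Length of the path built from the time bound T and the input number I."""
    return T * T * (2 * I + 1)


def decode(N: int) -> tuple[int, int]:
    """Recover (T, I) from N = T^2 (2I + 1), assuming T is a power of two."""
    if N <= 0:
        raise ValueError("N must be positive")
    odd = N
    while odd % 2 == 0:             # strip factors of two: odd part is 2I + 1
        odd //= 2
    T = math.isqrt(N // odd)        # the power-of-two part is T^2
    return T, (odd - 1) // 2


def snapshot_and_cell(position: int, T: int) -> tuple[int, int]:
    """Snapshot index and tape cell of a position inside the T^2 sub-path."""
    return divmod(position, T)
```

Two positions at distance exactly T get the same cell index and consecutive snapshot indices, which is the property the proof relies on.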

Corollary 2. Let f be an elementary function and c a constant. If there
exists an algorithm which, given a path P on n vertices and an MSO
formula $\phi$, decides if $P \models \phi$ in time $f(|\phi|) n^c$ then EXP=NEXP.

Corollary 3. Let f be an elementary function, c a constant, and C a
class of graphs closed under edge sub-divisions. If there exists an algorithm
which, given a graph $G \in C$ on n vertices and an MSO formula $\phi$, decides
if $G \models \phi$ in time $f(|\phi|) n^c$ then EXP=NEXP. The same is true if C is
closed under induced subgraphs and, for all d > 0, contains a graph with
diameter d.

Finally, we can extend the ideas given above to obtain an alternative,
self-contained proof of a result given in [3]: MSO2 model-checking
on cliques is not in XP, unless EXP=NEXP. In [3] this is proved under
the equivalent assumption $P_1 \neq NP_1$ (the P≠NP assumption for unary
languages). That proof relies on the work of Fagin on graph spectra [6].

Here we can simply reuse the ideas of Theorem 3 by observing two
basic facts: first, with an appropriate MSO2 formula we can select a set of
edges in the given clique that induces a spanning path. Therefore, we can
assume we have the same structure as in the case of paths. Second, the
$eq_L$ predicate can be constructed in constant size, since two disjoint sets
of vertices have equal size if and only if there exists a perfect matching between
them in the clique (and this is MSO2-expressible).

Corollary 4. If there exists an algorithm which, given a clique $K_n$ on n
vertices and an MSO2 formula $\phi$, decides if $K_n \models \phi$ in time $n^{f(|\phi|)}$, for any
function f, then EXP=NEXP.

5 Tree-Depth 

In this section we give a lower bound result that applies to the model- 
checking algorithm for trees of bounded height given by Gajarsky and 
Hlineny [9]. We recall here the main result:

Theorem 4 ([9]). Let T be a rooted t-colored tree of height h > 1, and let $\phi$ be an MSO
sentence with r quantifiers. Then $T \models \phi$ can be decided by an FPT algorithm
in time $O\left(\exp^{(h+1)}\left(2^{h+5} r(t+r)\right) + |V(T)|\right)$.

Theorem 4 is the main algorithmic tool used to obtain the recent elementary
model-checking algorithms for tree-depth and shrub-depth given
in [9] and [11], since in both cases the strategy is to interpret the graph
into a colored tree of bounded height. 

The running time given in Theorem 4 is an elementary function of the
formula $\phi$, but non-elementary in the height of the tree. Though we would
very much like to avoid that, it is not hard to see that the dependence on 
at least one of the parameters must be non-elementary, since allowing h 
to grow eventually gives the class of all trees so the lower bound result of 
Frick and Grohe should apply. 

It is less obvious however what the height of the exponentiation tower 
has to be exactly, as a function of h, the height of the tree. The fact that 
we know that the height of the tower must be unbounded (so that we 
eventually get a non-elementary function) does not preclude an algorithm 
that runs in time $\exp^{(\sqrt{h})}(|\phi|)$ or, less ambitiously, $\exp^{(h/2)}(|\phi|)$, or even
$\exp^{(h-1)}(|\phi|)$. Recall that we are trying to determine the number of levels
of exponentiation in the running time here, so shaving off even an additive 
constant would be a non-negligible improvement. 

We show that even such an improvement is probably impossible, and 
Theorem 4 determines precisely the complexity of MSO model-checking
on colored trees of height h, at least in the sense that it gives exactly the 
correct level of exponentiations. We establish this fact assuming the ETH, 
by combining lower bound ideas which have appeared in [8] and [15]. More
specifically, the main technical obstacle is comparing indices, or in other 
words, counting economically in our construction. For this, we use the 
tree representation of numbers of [8] pruned to height $h - 1$. We then use
roughly $\log^{(h)} n$ colors to differentiate the leaves of the constructed trees.

The basic idea of our reduction is to start from an instance of n-variable
3SAT and construct an instance made up of a tree with height h
colored with $t = O(\log^{(h)} n)$ colors. The formula will use O(1) quantifiers,
so the algorithm of Theorem 4 would run in roughly $\exp^{(h+1)}(O(\log^{(h)} n))$
time. If an algorithm running in $\exp^{(h+1)}(o(\log^{(h)} n))$ time existed we
would be able to obtain a $2^{o(n)}$ algorithm for 3SAT. Thus, the algorithm
is optimal up to the constant factor in the final exponent.
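
As a sanity check of this calculation, consider the special case h = 1, where the tree is colored with $t = O(\log n)$ colors. The chain below is only a worked illustration; it additionally assumes that the constructed tree has size polynomial in n, so that the polynomial factor in the running time is absorbed.

```latex
% Special case h = 1 (illustration only): t = O(\log n) colors.
\exp^{(2)}\bigl(o(\log n)\bigr)
  = 2^{2^{o(\log n)}}
  = 2^{n^{o(1)}}
  = 2^{o(n)}
```

The middle step uses $2^{\varepsilon \log n} = n^{\varepsilon}$ for a quantity $\varepsilon$ that tends to zero, and the last step uses that $n^{o(1)}$ grows more slowly than n.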

Theorem 5. If for some constant h > 1 there exists a model-checking
algorithm for t-colored rooted trees of height h that runs in $\exp^{(h+1)}(o(t)) \cdot
\mathrm{poly}(n)$ time for trees with n vertices then the Exponential Time
Hypothesis fails.

6 Conclusions and Open Problems 

We have proved non-elementary lower bounds for FO logic on cographs 
and MSO logic on uncolored paths. The hope is that, since these lower 
bounds concern very simple graph families, they can be used as "sanity 
checks" guiding the design of future graph widths. We have also given a 
lower bound for MSO logic on colored trees of bounded height. It would 
be interesting to see if this can be extended to uncolored trees. 

Finally, let us mention that a promising direction in this area that 
we did not tackle here is that of alternative logics, besides FO and MSO 
variants. One example is the meta-theorems given by Pilipczuk [16] for
a kind of modal logic. The algorithmic properties of such logics are still 
mostly unexplored but they may be a good way to evade the lower bounds 
given in [8] and this paper. 

A Omitted Material



A.1 Proof of Theorem 2

Proof. Suppose that we are given a binary word w and an FO formula
$\phi$. We will reduce the problem of deciding if $w \models \phi$ to the problem of
deciding whether $G \models \phi'$ for a threshold graph G and an FO formula $\phi'$
which we will construct.

First, let us describe G, and since it's a threshold graph we can describe 
it as a string over the alphabet {u, j}. The graph G starts with uuj. Then, 
for each character of w, if it is a 0 we append uj to the description of G,
otherwise we append ujj. So, for example the graph corresponding to 
w = 01101 would have description uujujujjujjujujj. Notice that, since 
the last character in the description is a j, the graph is connected. 

Now we need to interpret the formula $\phi$ into the new context. To do
this, let's first observe some basic properties of our graph. First, a vertex 
in this graph is a union vertex if and only if its neighborhood is a clique. 
To see this, note that union vertices are only connected to join vertices, 
which form a clique. All join vertices on the other hand are connected 
to the first two union vertices which are not connected. Second, all union 
vertices, except the first two (dummy) vertices have at least one join vertex 
as a non-neighbor, namely at least the first join vertex. 

We thus define the following formulas:

$union(x) := \forall y \forall z\, (E(x,y) \land E(x,z) \land y \neq z) \to E(y,z)$
$main(x) := union(x) \land (\exists y\, \neg union(y) \land \neg E(x,y))$

This will allow us to simulate selecting a character in the word by 
selecting the union vertex which represents the corresponding pair or triple 
of vertices in the graph. 

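Now we also need to encode the $\prec$ and $P_1$ predicates. We define 

$$\mathit{prec}(x, y) := \exists z\, \neg\mathit{union}(z) \land E(x, z) \land \neg E(y, z)$$

$$\mathit{one}(x) := \exists y \exists z\, (y \neq z) \land \neg\mathit{union}(y) \land \neg\mathit{union}(z) \land E(x, y) \land E(x, z) \land \forall w\, (\mathit{main}(w) \land \mathit{prec}(x, w)) \to (\neg E(y, w) \land \neg E(z, w))$$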

The intuition for the first is that, if x, y are two union vertices that 
represent two different blocks, x precedes y if and only if there exists some 
join vertex connected to x but not y. For $P_1$ we have that x is a union 
vertex representing a ujj block if and only if there exist two join vertices 
connected to it and not connected to any union vertex that comes later 
in the description. 

Given the above it is straightforward to produce the formula $\phi'$ from 
$\phi$: the formulas prec and one are used to translate the corresponding 
atomic predicates $\prec$ and $P_1$, while we inductively replace $\exists x\, \psi(x)$ with 
$\exists x\, \mathit{main}(x) \land \psi'(x)$ where $\psi'(x)$ is the translation of $\psi(x)$. It is not hard 
to see that $|\phi'| = O(|\phi|)$ while the order of G is $O(|w|)$. □ 
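
As a sanity check on the four formulas of this proof (union, main, prec, one), the following Python sketch evaluates them by brute force on the constructed graph and reads the word back. The helper names `build` and `decode` are hypothetical, and quantifiers are simply replaced by loops over all vertices, so this is only meant for very small words.

```python
def build(word):
    """Adjacency matrix of the threshold graph for a binary word (uuj, then uj / ujj)."""
    desc = "uuj" + "".join("uj" if c == "0" else "ujj" for c in word)
    n = len(desc)
    adj = [[False] * n for _ in range(n)]
    for v, kind in enumerate(desc):
        if kind == "j":  # a join vertex is adjacent to every earlier vertex
            for u in range(v):
                adj[u][v] = adj[v][u] = True
    return adj


def decode(adj):
    """Recover the word using only the edge relation, mirroring union/main/prec/one."""
    V = range(len(adj))
    E = lambda a, b: adj[a][b]
    # union(x): the neighbourhood of x is a clique.
    union = [all(E(y, z) for y in V for z in V
                 if E(x, y) and E(x, z) and y != z) for x in V]
    # main(x): a union vertex that has some join vertex as a non-neighbour.
    main = [union[x] and any(not union[y] and not E(x, y) for y in V) for x in V]

    def prec(x, y):
        # Some join vertex is adjacent to x but not to y.
        return any(not union[z] and E(x, z) and not E(y, z) for z in V)

    def one(x):
        # Two distinct join neighbours of x that miss every later main vertex.
        return any(
            y != z and not union[y] and not union[z] and E(x, y) and E(x, z)
            and all(not (main[w] and prec(x, w)) or (not E(y, w) and not E(z, w))
                    for w in V)
            for y in V for z in V
        )

    mains = [x for x in V if main[x]]
    # Order the block representatives by how many representatives precede them.
    mains.sort(key=lambda x: sum(prec(w, x) for w in mains))
    return "".join("1" if one(x) else "0" for x in mains)
```

Note that `decode` never looks at the u/j description, only at adjacency, which is the point of the interpretation.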



A.2 Proof of Corollary 1 

Proof. It suffices to prove this if the class is closed under union and join, 
because if it's closed under union and complement we get closure under 
join "for free". The proof is immediate if $K_1 \in C$, since then the class 
contains threshold graphs. Otherwise, let $G_m$ be the graph of the smallest 
order in the class and say it has k vertices. We will construct a graph as in 
the proof of Theorem 2 except that for each vertex we would be adding 
in that case we will add a copy of $G_m$. More specifically, for each union 
vertex of the threshold graph we add a disconnected copy of $G_m$ to the 
graph we are constructing and for each join vertex we add a copy of $G_m$ 
and connect all its vertices to all previously added vertices. It's easy to 
see that the graph we have constructed is still in C. 

It is now not hard to see how to translate the proof of Theorem 2 in this 
case. Replace every $\exists x\, \phi(x)$ with $\exists x_1 \exists x_2 \cdots \exists x_k\, G_m(x_1, \ldots, x_k) \land \phi(x_1)$, 
where $G_m$ is a formula stating that the $x_i$'s have the structure of $G_m$ 
(that is, they are all distinct and have the same edges as $G_m$) and they all 
have the same neighbors in the rest of the graph. Knowing that the $x_i$'s 
form a copy of $G_m$ that corresponds to a single character, we then take 
one representative and use it in the rest of the formula (note that if we 
take one representative from each copy the result is a threshold graph). 

This trick is sufficient to translate the formulas for main and prec. 
The only places where we may run into a problem are the union and one 
formulas, because we use the $\neq$ predicate there. Since we are picking $x_1$ 
as an arbitrary representative of a copy of $G_m$, if $G_m$ has a non-trivial 
automorphism it could be the case that the k vertices that correspond 
to $\exists y$ and the k vertices that correspond to $\exists z$ are assigned to the same 
copy of $G_m$, but $y_1 \neq z_1$. To avoid this case we just need to add an extra 
formula after the quantification of y, z stating that all $y_i, z_j$ are pairwise 
distinct. □ 

A.3 Formal Definition of $\mathit{eq}_L$ 



Let us now give a formal definition of $\mathit{eq}_L$. First, we need to be able to 
recognize sections. Assume that we have a set of elements U and a subset 
$P \subseteq U$. As explained, P divides U into sections so we define a formula 
section(S, U, P) that will be true if S is such a section. 



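$$\mathit{consec}(S, U) := \dots$$
$$\mathit{partsection}(S, U, P) := S \subseteq U \land \mathit{consec}(S, U) \land \forall x, y \in S\, (x \in P \leftrightarrow y \in P)$$
$$\mathit{section}(S, U, P) := \mathit{partsection}(S, U, P) \land \forall S'\, (S \subseteq S' \land \mathit{partsection}(S', U, P) \to S' = S)$$

Informally, consec checks if S is a contiguous subset of U. Then, S is 
a partial section if it's a subset of either P or U \ P and it represents a 
contiguous subset of U elements. S is a section if it's a maximal partial 
section. 

Assuming two sets $S_1, S_2$ represent consecutive sections and we have 
a set B which is supposed to encode a binary number i in $S_1$ and i + 1 in 
$S_2$ we check this with the following formula: 

$$\begin{aligned}
\mathit{next}(S_1, S_2, B) := {} & \exists S_1^l \exists s^1 \exists S_1^r \exists S_2^l \exists s^2 \exists S_2^r \\
& \bigwedge_{i=1,2} \dots \land \bigwedge_{i=1,2} \dots \land \bigwedge_{i=1,2} (S_i \subseteq S_i^l \cup \{s^i\} \cup S_i^r) \\
& \land s^2 \in B \land s^1 \notin B \land (S_1^r \subseteq B) \land (S_2^r \cap B = \emptyset) \\
& \land \mathit{eq}_L(S_1^r, S_2^r) \\
& \land \mathit{same}(S_1^l, S_2^l, B, B)
\end{aligned}$$

$$\begin{aligned}
\mathit{same}(S_1, S_2, B_1, B_2) := {} & \exists f_1 \exists f_2 \bigwedge_{i=1,2} (\forall x \in S_i\; f_i \prec x \lor f_i = x) \land \dots \\
& \exists l_1 \exists l_2 \bigwedge_{i=1,2} ((\forall x \in S_i'\; x \prec l_i \lor l_i = x) \land \dots)
\end{aligned}$$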
Informally, next partitions the sets $S_1, S_2$ into a left and right part and 
identifies a vertex in the middle. Assuming $|S_1| = |S_2|$ respective parts 
have the same size; the right part of $S_1$ corresponds to all 1 digits and the 
right part of $S_2$ to all 0. The left parts have to encode the same number, 
which is checked by same. The idea here is that if we select equal length 
contiguous prefixes of the sets we are checking the last element will encode 
the same digit in both. We allow same to use different sets $B_1, B_2$ to read 
the encoding in $S_1, S_2$. This extra generality will be useful when we reuse 
this formula later. 

We will use next only for neighboring sections. To check if two disjoint 
sections are indeed adjacent we define: 

$$\mathit{adj}(S_1, S_2, U) := \exists x \in S_1\, \exists y \in S_2\, \forall z \in U\, (z \prec x \lor y \prec z \lor z = x \lor z = y)$$

Informally, adj is true if $S_1$ is a section that directly precedes the 
section $S_2$, because then x is the last element of $S_1$ and y is the first 
element of $S_2$ and there are no elements between them. 

We are now ready to define $\mathit{eq}_{L'}$ for $L' = L \cdot 2^L$. 



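$$\begin{aligned}
\mathit{eq}_{L'}(P_1, P_2) := {} & \exists Q_1 \exists Q_2 \exists R_1 \exists R_2 \exists B_1 \exists B_2 \\
& \bigwedge_{i=1,2} (Q_i \subseteq P_i \land R_i \subseteq P_i \land B_i \subseteq P_i) \land \mathit{eq}_L(R_1, R_2) \land {} \\
& \forall S_1 \forall S_2\, ((\mathit{section}(S_1, P_1 \setminus R_1, Q_1) \land \mathit{section}(S_2, P_2 \setminus R_2, Q_2)) \to \mathit{eq}_L(S_1, S_2)) \land {} \\
& \bigwedge_{i=1,2} (\forall S \forall S'\, ((S \cap S' = \emptyset) \land \mathit{section}(S, P_i \setminus R_i, Q_i) \land \mathit{section}(S', P_i \setminus R_i, Q_i) \land \mathit{adj}(S, S', P_i \setminus R_i)) \to \mathit{next}(S, S', B_i)) \land {} \\
& \exists S_1^l \exists S_2^l\, \mathit{same}(S_1^l, S_2^l, B_1, B_2) \land {} \\
& \bigwedge_{i=1,2} \forall S'\, ((\mathit{section}(S', P_i \setminus R_i, Q_i) \land (S' \cap S_i^l = \emptyset)) \to \exists x \in S'\, \exists y \in S_i^l\; x \prec y) \land {} \\
& \forall S \bigwedge_{i=1,2} (\mathit{section}(S, P_i \setminus R_i, Q_i) \to (S \setminus B_i \neq \emptyset)) \land {} \\
& \exists S \bigwedge_{i=1,2} (\mathit{section}(S, P_i \setminus R_i, Q_i) \land (S \cap B_i = \emptyset))
\end{aligned}$$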

This is rather long, so let us explain it intuitively. We want to test if 
$|P_1| = |P_2|$, so we demand the following: 

— From both we remove a remainder set $R_i$, and we make sure that the 
remainder sets are equal. 

— We use $Q_1, Q_2$ to partition the two sets into sections. All sections of the 
first set must be equal in size to all sections of the second (therefore, 
all sections in both sets are equal). 

— Select the sets $B_i$ which will encode binary numbers in the sections. 
For each two disjoint sections which are consecutive check that they 
encode consecutive numbers. 

— Find the last section on each set $(S_1^l, S_2^l)$. Check that they encode 
the same number. 

— Check that no section encodes a number made up only of 1s, so we 
don't have an overflow in the counting. 

— Check that the first section on each set encodes the number zero (that 
is, it has no elements from $B_i$). For this it's sufficient to check that 
some section encodes zero, since we have already established proper 
ordering. 
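
To make the counting scheme in the list above concrete, here is a small Python sketch that produces one possible choice of the witness sets R, Q and B for a set of n elements, assuming the elements are the integers 0, ..., n-1 in their usual order. The helpers `sections`, `witnesses` and `value` are names invented for this illustration and do not appear in the construction itself.

```python
def sections(U, P):
    """Maximal runs of the ordered list U lying entirely inside P or entirely outside P."""
    runs = []
    for x in U:
        if runs and ((runs[-1][-1] in P) == (x in P)):
            runs[-1].append(x)
        else:
            runs.append([x])
    return runs


def witnesses(n, ell):
    """Choose R, Q, B for the set {0, ..., n-1} cut into sections of length ell."""
    m = n // ell                      # number of full sections
    if m > 2 ** ell - 1:
        raise ValueError("too many sections: the binary counter would overflow")
    body = list(range(m * ell))       # the elements outside the remainder
    R = set(range(m * ell, n))        # remainder, fewer than ell elements
    # Alternate membership in Q so that every block of ell elements is one section.
    Q = {x for x in body if (x // ell) % 2 == 0}
    # Section number s stores s in binary, least significant digit on the right.
    B = {x for x in body if ((x // ell) >> (ell - 1 - x % ell)) & 1}
    return body, R, Q, B


def value(section, B):
    """Read the binary number that B writes on a section."""
    return int("".join("1" if x in B else "0" for x in section), 2)


body, R, Q, B = witnesses(n=23, ell=3)
runs = sections(body, Q)
values = [value(s, B) for s in runs]
```

With n = 23 and sections of length 3 the seven sections carry the numbers 0 through 6, the first one is zero, none is all 1s, and two elements are left over as the remainder.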

Proof (of the Lemma). 

Correctness follows by induction and the definition of the construction 
given above. For the size bound, note that for $L' = L \cdot 2^L$ we have 
$|\mathit{eq}_{L'}| = O(|\mathit{eq}_L|)$, since the definition of $\mathit{eq}_{L'}$ uses $\mathit{eq}_L$ a constant number 
of times. It follows that there exists a constant c such that for all k we 
have $|\mathit{eq}_{\exp^{(k)}(1)}| = O(c^k)$. The result follows by setting $k = \log^* L$. □ 

A.4 Formal Definition of exp and Other Arithmetic 

Again, we first define some auxiliary formulas. Checking if a set is twice 
as large as another can be done as follows: 

$$\mathit{double}(S_1, S_2) := \exists S'\, (S' \subseteq S_2) \land \mathit{eq}_L(S_1, S') \land \mathit{eq}_L(S_1, S_2 \setminus S')$$

If we are given three elements x, y, z such that $x \prec y \prec z$ we can check 
that the distance is doubled as follows: 

$$\begin{aligned}
\mathit{ddist}(x, y, z, U) := {} & \exists S_1 \subseteq U \land \forall u\, (((x \prec u \lor x = u) \land (u \prec y)) \leftrightarrow u \in S_1) \\
& \land \exists S_2 \subseteq U \land \forall u\, (((y \prec u \lor y = u) \land (u \prec z)) \leftrightarrow u \in S_2) \\
& \land \mathit{double}(S_1, S_2)
\end{aligned}$$

Informally, we select the sets $S_1, S_2$ as the sets of elements starting 
from x and up to (but not including) y, and starting from y and up to 
(but not including) z. The second set must be twice as large. Now we can 
define exp: 

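$$\begin{aligned}
\mathit{exp}(P_1, P_2) := {} & \exists Q \exists f \exists s \exists l\, (Q \subseteq P_2) \land f \in Q \land s \in Q \land l \in Q \land f \prec s \land {} \\
& \forall u\, (u \in P_2) \to ((u \prec l \lor u = l) \land (s \prec u \lor s = u \lor f = u)) \land {} \\
& \mathit{eq}_L(P_1, Q \setminus \{l\}) \land {} \\
& \forall x \forall y \forall z\, ((x \in Q) \land (y \in Q) \land (z \in Q) \land (x \prec y) \land (y \prec z) \land {} \\
& \quad \neg \exists u\, ((u \in Q) \land (((x \prec u) \land (u \prec y)) \lor ((y \prec u) \land (u \prec z))))) \to \mathit{ddist}(x, y, z, P_2)
\end{aligned}$$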

In words, we select a set Q from $P_2$ so that any three consecutive 
elements double consecutive distances. The total size of $P_2$ must then be 
$(1 + 2 + 4 + \ldots + 2^{|P_1|-1}) + 1 = 2^{|P_1|}$, where the sum is obtained by adding 
the consecutive distances, and we add one at the end because the last 
element l was not counted. 
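
The size computation above can be checked numerically. The following Python sketch lists where the elements of Q would sit if the second set were the integers 0, 1, 2, ... in their usual order and the first set had m elements; the function name `doubling_positions` is an assumption made for this illustration.

```python
def doubling_positions(m):
    """Positions of the m + 1 elements of Q when the first set has m >= 1 elements."""
    # f sits at position 0, s at position 1, and every later gap doubles.
    positions = [0]
    gap = 1
    for _ in range(m):
        positions.append(positions[-1] + gap)
        gap *= 2
    return positions  # the last entry is the position of l


q = doubling_positions(5)
gaps = [b - a for a, b in zip(q, q[1:])]
total_size = q[-1] + 1  # l is the last element of the second set
```

Here the gaps are 1, 2, 4, 8, 16 and the last element sits at position 31, so the second set has 32 = 2^5 elements, matching the sum in the text.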

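Finally, we would like a formula that calculates the k-th root, assuming 
(to keep things simple) that k is a power of two. We will first need a 
formula $\mathit{root}_L(P_1, P_2)$ which will simply be true if $|P_2| = |P_1|^2$. 

$$\begin{aligned}
\mathit{root}_L(P_1, P_2) := {} & \exists Q\, (Q \subseteq P_2) \land (\forall S\, \mathit{section}(S, P_2, Q) \to \mathit{eq}_L(S, P_1)) \land {} \\
& \exists S'\, \mathit{eq}_L(S', P_1) \land \forall S\, \mathit{section}(S, P_2, Q) \to (|S \cap S'| = 1)
\end{aligned}$$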

Informally, we can divide $P_2$ into sections of size $|P_1|$ and if we select 
a set S' that contains exactly one representative from each section then 
$|S'| = |P_1|$. Given this formula, we can now define $\mathit{root}_L^{(k)}(P_1, P_2)$, the 
formula that calculates k-th roots as follows: 

$$\mathit{root}_L^{(2)}(P_1, P_2) := \mathit{root}_L(P_1, P_2)$$

$$\mathit{root}_L^{(k)}(P_1, P_2) := \exists Q\, \mathit{root}_L(Q, P_2) \land \mathit{root}_L^{(k/2)}(P_1, Q)$$

It is not hard to see that $\mathit{root}_L^{(k)}(P_1, P_2)$ is true if $|P_2| = |P_1|^k$, assuming 
that k is a power of two. 

We will need to perform division: 

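$$\mathit{div}(P_1, P_2) := \exists Q \subseteq P_2\, \forall S\, \mathit{section}(S, P_2, Q) \to \mathit{eq}_L(S, P_1)$$
$$\mathit{less}_L(P_1, P_2) := \exists S\, (P_1 \subset S) \land \mathit{eq}_L(S, P_2)$$
$$\mathit{mod}_L(P_1, P_2, R) := \exists S \subseteq P_2 \land \mathit{eq}_L(S, R) \land \mathit{div}(P_1, P_2 \setminus S) \land \mathit{less}_L(R, P_1)$$

The div formula decides if $|P_1|$ exactly divides $|P_2|$ by partitioning $P_2$ 
into sections of size $|P_1|$. Using this we can then calculate remainders. 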

A.5 Proof of Theorem 3 

Proof. Suppose that we are given a non-deterministic Turing machine that 
runs in time $2^{n^k}$, for some constant k, when given n bits of input. We will 
use the hypothetical algorithm to predict whether the machine accepts an 
arbitrary input in deterministic exponential time. 

Let us discuss some technical details about the machine. Without loss 
of generality, assume that k is a power of two and we are given a 
non-deterministic machine that always terminates in time at most $T = 2^{n^k}$. 
Assume that the machine uses a binary alphabet, and without loss of 
generality it never uses more than T cells of tape. Also without loss of 
generality, we may assume that the first thing the machine does is 
non-deterministically guess a string of bits and use it to fill out its tape. From 
that point on the machine behaves deterministically, that is, there is a 
finite set of states Q and a transition function 
$\delta : Q \times \{0, 1\} \to Q \times \{0, 1\} \times \{L, S, R\}$ that tells the machine, for each state and cell character, 
which state to go to next, what to write on the current cell and whether to 
move left, right, or stay at the same cell. The state set Q contains a special 
state $q_{acc}$ such that if the machine ever enters this state it automatically 
accepts and never leaves this state. 

Suppose that we have been given the description of such a machine 
with |Q| states, where |Q| is independent of the input, and n bits of 
input. We will construct a unary string w of appropriate length and an 
MSO formula $\phi$ such that $w \models \phi$ if and only if the machine would accept 
this input. 

Let I be the number whose binary representation is exactly the input 
given to the machine (so $I < 2^n$). Construct a unary string w of length 
$L = (2I + 1)T^2$, where we recall that T is the upper bound on the machine's 
running time. Now we need to construct the formula $\phi$. 

Rather than giving all formal details, we will now give a high-level 
description of $\phi$ and the reader may verify that $\phi$ can indeed be constructed 
with the tools from the previous section. Our formula will first ensure the 
following: 

— First, it will identify a subset of the input with size I and another with 
size $T^2$. This is achievable by observing that the largest odd divisor of 
L is 2I + 1, so we simply ask for the largest odd set whose size exactly 
divides the input. 

— Using the root formula we partition the set of size $T^2$ into T equal-size 
sections. Each will correspond to a snapshot of the tape during a step 
in the machine's execution. 

— Identify the first section of the tape, and then identify a prefix of it 
with size $n^k$ (this can be done with the exp formula). Identify a prefix 
of that with size n (this can be done with the $\mathit{root}^{(k)}$ formula, since k 
is a constant and a power of two). This is where the machine's input 
will initially be stored. 
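
The first step in the list above relies on a simple arithmetic fact: since T is a power of two, the largest odd divisor of (2I + 1)T^2 is 2I + 1. A short Python sketch of this decoding follows; `encode_length` and `decode_length` are names chosen here for illustration only.

```python
def encode_length(I, T):
    """Length of the unary string for input number I and time bound T."""
    return (2 * I + 1) * T * T


def decode_length(L):
    """Split L = (2I + 1) * T**2, with T a power of two, back into (I, T)."""
    odd = L
    power = 1
    # Strip factors of two: what remains is the largest odd divisor 2I + 1.
    while odd % 2 == 0:
        odd //= 2
        power *= 2
    # power now equals T**2; recover T by doubling a candidate.
    T = 1
    while T * T < power:
        T *= 2
    if T * T != power:
        raise ValueError("L is not of the form (2I + 1) * T**2 with T a power of two")
    return (odd - 1) // 2, T


# Example with n = 3 input bits (101 in binary, so I = 5) and k = 2.
T_example = 2 ** (3 ** 2)
L_example = encode_length(5, T_example)
```

Calling `decode_length(L_example)` recovers the pair (5, 512).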

It should be clear that the above can be expressed in MSO with the 
formulas of the previous section. So at this point, we have identified T 
sections, each of size T, to represent the machine's tape. Each element 
thus naturally corresponds to a specific cell at a specific point in time 
during the machine's run. We have also identified a special part at the 
start of the first tape section with size n, where we will check that the 
input is stored, and a set of length I that encodes the input. Now we ask 
for the existence of a subset B of elements that will indicate the cells of 
the tape where 1 is written. We also ask for the existence of |Q| sets, call 
them $H_i$, $i \in Q$. The intended meaning is that if a certain element from 
one of the tape sections is in $H_i$, then the machine was in state i at the 
point in time corresponding to that section and the machine's head was 
located at the cell corresponding to that element. 

Once the above sets have been selected all of the machine's computa- 
tion has been encoded. Now we just need to check that it's correct and 
accepting. We thus express the following conditions: 

— Ensure the input is correctly encoded at the start of the tape. To check 
the bit at position i we observe that the contiguous subset of the tape 
from the beginning to that bit has size i. Using exp we can construct a 
set of size $2^i$ and then a set of size $2^{i+1}$. We "calculate" the remainder 
of the set of length I with the set of size $2^{i+1}$. If the result has size at 
least $2^i$ the bit has to be 1 otherwise 0. With this idea we can check 
the correctness of all input bits. 

— Ensure that the machine transitions correctly. We look at pairs of 
elements that correspond to the same tape cell in consecutive steps in 
time, that is, tape elements whose distance is exactly T, which can be 
verified with the $\mathit{eq}_L$ formula. If the first has no $H_i$ label then either 
both have B or neither does. If the first has an $H_i$ label we check that 
the B label changes appropriately for the other and an $H_j$ label is 
used appropriately for the other or one of its neighbors, depending on 
the transition function. 

— Finally, check that in each section of the tape exactly one element has 
an $H_i$ label, and that it has exactly one. Also, check that some element 
eventually gets the $H_{q_{acc}}$ label. 
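
The remainder test used for the input bits in the first condition above is ordinary arithmetic and can be written down directly. The sketch below assumes bit 0 is the least significant bit of I; the function name `input_bit` is hypothetical.

```python
def input_bit(I, i):
    """Bit i of I, obtained by comparing I mod 2**(i + 1) with 2**i."""
    remainder = I % 2 ** (i + 1)
    return 1 if remainder >= 2 ** i else 0


bits_of_13 = [input_bit(13, i) for i in range(4)]  # 13 is 1101 in binary
```

The formula performs the same comparison on set sizes, with exp supplying the two powers of two and the remainder formula supplying the left-hand side.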

All the above requirements can be checked with an MSO formula with 
constant size (assuming k, |Q| constant), except for uses of the $\mathit{eq}_L$ 
predicate, which has size $2^{O(\log^* L)}$. So the whole formula also has size $2^{O(\log^* L)}$ 
and $L = 2^{\mathrm{poly}(n)}$ so $|\phi| = 2^{O(\log^* n)}$. 

Suppose that an algorithm with running time $f(|\phi|) \cdot \mathrm{poly}(|w|)$ existed for 
elementary f. Then, there exists d such that $f(x) \leq \exp^{(d)}(x)$. So the 
running time is at most $\exp^{(d+1)}(O(\log^* n)) \cdot 2^{\mathrm{poly}(n)} = 2^{\mathrm{poly}(n)}$. □ 

A.6 Proof of Corollary 3 

Proof. If the class is closed under induced subgraphs and for each d > 0 
there is a graph in the class with diameter d, then the class contains all 
paths and therefore we can invoke Corollary 2. To see this, for each d take 
the graph with diameter d and let u, v be two vertices with shortest path 
distance d. The graph induced by u, v and the vertices that make up a 
shortest path from u to v is a path, since if more edges were induced a 
shorter path would exist from u to v. Thus, the class contains a path with 
d + 1 vertices. 

If the class is closed under edge subdivisions we can reduce (in fact 
interpret) the MSO model checking problem on paths to MSO model 
checking on the class. We have a path with n vertices and by the proof 
of Theorem 3 we can assume n to be even. Select the smallest graph in 
C, call it Gm, and subdivide its edges an appropriate number of times 
so that all maximal connected sets of degree two vertices have odd size. 
Select one such set and subdivide its edges so that it has size n (this is 
always possible if we started with a sufficiently large n). It is not hard 
to amend the original formula so that it first locates this path of size n 
that we created (it's now the only maximal connected set of degree two 
vertices with even size) and only works with vertices from it. Since we 
only subdivided edges the graph we have is still in C. □ 

A.7 Proof of Corollary 4 

Proof. The proof follows similar lines as in Theorem 3 so we explain 
here the differences. First, we must implement the $\prec$ predicate on the 
clique. This is achieved by selecting a set of edges that induces a path and 
then using the same tricks as we use to simulate MSO for strings with 
$\mathrm{MSO}_1$ for paths. Second, we must implement an $\mathit{eq}(P_1, P_2)$ predicate with 
constant size. If we do both of these, the rest of the proof of Theorem 3 
goes through unchanged, since the formula we construct only uses the $\prec$ 
and = predicates and has constant size except for the $\mathit{eq}_L$ predicate. 

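The main observation is that we can ask for the existence of a perfect 
matching between two disjoint sets as follows (we denote by $\exists_e$ edge-set 
quantification, and I(x, e) the vertex-edge incidence relation): 

$$\begin{aligned}
\mathit{pm}(S_1, S_2) := \exists_e F\; & \forall x\, (x \in S_1 \lor x \in S_2) \to (\exists e \in F\; I(x, e) \land (\forall f \in F\; I(x, f) \to e = f)) \\
& \land \forall e \in F\; \exists x \exists y\, (x \in S_1) \land (y \in S_2) \land I(x, e) \land I(y, e)
\end{aligned}$$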

Informally, we ask if there exists a set of edges so that each vertex has 
exactly one incident edge and all edges have one endpoint in each set. It 
should be clear that there exists a perfect matching between two disjoint 
sets if and only if the sets have the same size. Thus, we can define eq as 
follows: 

$$\mathit{eq}(P_1, P_2) := \mathit{pm}(P_1 \setminus P_2, P_2 \setminus P_1)$$

Thus, all that is left is to implement the $\prec$ predicate. We give here 
a high-level argument. First, we will ask for the existence of a set F of 
edges with the following properties: 

— All vertices have exactly two edges of F incident on them. 

— For any partition of the vertices there exists an edge from F with 
endpoints on both sides (connectivity). 

It is not hard to see that F induces a spanning cycle. Let F' be the 
set obtained from F by removing an arbitrary edge and let s be one of 
the endpoints of the removed edge. We will now say that $x \prec y$ if any 
subset of edges from F' that connects s to y must touch x. 

The size of the clique we construct is the same as the length of the 
string in Theorem 3. So if there exists an algorithm running in time 
polynomial in the order of the input clique for fixed-size formulas then we can 
simulate a NEXP Turing machine in exponential time and EXP=NEXP. □ 

A.8 Proof of Theorem 5 



Fig. 3. An example of the graphs constructed in the proof of Theorem 5. 
Assume we have two labels available, one represented by solid grey fillings 
and the other with a dashed box around vertices that have it. The first 4 
numbers (0, . . . , 3) can be represented by a single vertex. Numbers up to 
$2^4 - 1$ can be represented with trees of height 1, numbers up to $2^{16} - 1$ 
with trees of height 2, etc. 




Proof. As usual in such proofs, the main obstacle is how to encode 
numbers up to n economically in terms of the height of the constructed tree 
and the colors used. Fix some $h \geq 2$ (we will handle the case h = 1 in 
the end). We have at our disposal around $\log^{(h)} n$ colors. By using them 
we can create $2^{\log^{(h)} n} = \log^{(h-1)} n$ vertices which we can distinguish by 
using a different set of colors on each vertex. To go from there to n we 
will use the trick of [8] which, roughly speaking, gives exponentially more 
counting power with each level of height added. Thus, we will manage to 
represent numbers up to n with trees of height h − 1. 

Let us now be more precise. We have at our disposal $\log^{(h)} n$ colors, 
number them $0, \ldots, \log^{(h)} n - 1$. We will define for each $i \in \{0, \ldots, n - 1\}$ 
a rooted colored tree $T_i$. The construction is inductive: 

— If $i \in \{0, \ldots, \log^{(h-1)} n - 1\}$ then i has a binary representation with 
at most $\log^{(h)} n$ bits, say $b_k b_{k-1} \cdots b_1 b_0$ with $k \leq \log^{(h)} n - 1$. The tree 
$T_i$ is a single vertex colored with exactly the colors j such that $b_j = 1$. 

— Suppose that we have defined $T_i$ for $i \in \{0, \ldots, \log^{(k)} n - 1\}$ for some 
$k \geq 1$. We will now define $T_i$ for $\log^{(k)} n \leq i \leq \log^{(k-1)} n - 1$. As 
previously, write down the binary representation of i, which has at 
most $\log^{(k)} n$ bits. For each j such that $b_j = 1$ construct a copy of $T_j$ 
(we already know how to do this by the inductive hypothesis). Add a 
new vertex, which will be the root of the new tree, and connect it to 
the roots of the constructed trees. 

Some examples of the above construction are given in Figure 3. Now 
let i be an integer such that $1 \leq i \leq h$. We observe that the above 
construction represents numbers which are at most $\log^{(h-i)} n - 1$ with 
trees of height i − 1. This can be proved by induction: for i = 1 trees 
of height 0 (that is, single vertices) are used to represent numbers up to 
$\log^{(h-1)} n - 1$. For the inductive case, notice that each level of height added 
increases the maximum number representable exponentially. As a result, 
the numbers 0, . . . , n − 1 can be represented with a tree of height h − 1. 
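
The inductive construction of the trees can be sketched in Python as follows. The representation is an assumption of this sketch: a tree is a pair (set of colors at the root, set of child subtrees), c stands for the number of available colors, and the names `tree`, `height` and `decode` are invented here.

```python
def tree(i, c):
    """Rooted coloured tree for the number i when c colours are available."""
    bits = [j for j in range(i.bit_length()) if (i >> j) & 1]
    if i < 2 ** c:
        # Small numbers: a single vertex whose colours are the positions of the 1 bits.
        return (frozenset(bits), frozenset())
    # Larger numbers: an uncoloured root with one child subtree per 1 bit.
    return (frozenset(), frozenset(tree(j, c) for j in bits))


def height(t):
    """Height of a tree: 0 for a single vertex."""
    _, children = t
    return 0 if not children else 1 + max(height(ch) for ch in children)


def decode(t):
    """Recover the number represented by a tree."""
    colours, children = t
    if not children:
        return sum(2 ** j for j in colours)
    return sum(2 ** decode(ch) for ch in children)


# With two colours, as in Figure 3: 5 = 101 in binary has children for 0 and 2.
t5 = tree(5, 2)
t12 = tree(12, 2)
```

With two colors, the numbers 0 to 3 are single vertices, numbers up to 15 have height 1 and numbers up to 65535 have height 2, in line with the figure caption.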

We can now also define an $\mathit{eq}_k(x, y)$ predicate, that will be true if and 
only if x, y are the roots of two trees of height at most k representing the 
same number. Again we proceed inductively: 

— It's easy to define a simple propositional predicate samecols(x, y). The 
predicate will be true if x, y have exactly the same colors from the 
set $\{0, 1, \ldots, \log^{(h)} n - 1\}$. Using this, for k = 0 we set 
$\mathit{eq}_0(x, y) := \mathit{samecols}(x, y) \land \forall z\, \neg C(z, x) \land \neg C(z, y)$. In other words, x, y are equal 
if they have the same colors and no children. 

— Suppose $\mathit{eq}_k(x, y)$ is defined, we will define $\mathit{eq}_{k+1}(x, y)$. We set 

$$\begin{aligned}
\mathit{eq}_{k+1}(x, y) := {} & \mathit{samecols}(x, y) \land \forall u\, (C(u, x) \lor C(u, y)) \to {} \\
& \exists v\, (\mathit{eq}_k(u, v) \land (C(u, x) \to C(v, y)) \land (C(u, y) \to C(v, x)))
\end{aligned}$$

In words, x, y have the same colors and for every vertex that is the 
child of one of them there exists a vertex that is a child of the other 
and these two vertices represent the same number. 
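
The recursion in this definition can be mirrored directly in Python. In this sketch a vertex is a pair (set of colors, list of children), and the function `eq` and the small example trees are hypothetical names used only for illustration.

```python
def eq(k, x, y):
    """True if x and y have equal colours and, up to depth k, matching children."""
    colours_x, children_x = x
    colours_y, children_y = y
    if colours_x != colours_y:        # samecols(x, y)
        return False
    if k == 0:
        # Base case: equal colours and neither vertex has a child.
        return not children_x and not children_y
    # Every child of one vertex needs an equal child of the other vertex.
    return (all(any(eq(k - 1, u, v) for v in children_y) for u in children_x)
            and all(any(eq(k - 1, u, v) for v in children_x) for u in children_y))


def leaf(colours):
    return (frozenset(colours), [])


five = (frozenset(), [leaf([]), leaf([1])])          # children encode 0 and 2
five_again = (frozenset(), [leaf([1]), leaf([])])    # same children, other order
twelve = (frozenset(), [leaf([1]), leaf([0, 1])])    # children encode 2 and 3
```

The order in which children are listed does not matter, which is what the two universally quantified directions of the formula ensure.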

It is not hard to see that the formula $\mathit{eq}_h(x, y)$ uses O(h) quantifiers. 
We are now ready to describe our construction. 

Fix h > 1 and start with an instance of 3SAT with n variables and 
suppose that these variables are named $x_i$, $i \in \{n, \ldots, 2n - 1\}$. The reason 
we number the variables this way is that it will be convenient for all of 
them to have an index high enough that a non-trivial tree is needed to 
describe it. For each variable $x_i$ construct a copy of the tree $T_i$ as described 
above. We will make use of $\log^{(h)} 2n = \log^{(h)} n + o(1)$ colors so trees have 
height at most h − 1. Color the roots of all these trees with a new color, 
call it v. 

For each clause $(l_i \lor l_j \lor l_k)$ where $l_i, l_j, l_k$ are literals, that is, positive 
or negative appearances of the variables $x_i, x_j, x_k$ respectively, construct 
three trees $T_i, T_j, T_k$. Introduce six new colors, call them $c_{p,q}$ for $p \in [3]$, 
$q \in \{0, 1\}$. If the literal $l_i$ is positive then color the children of the 
root of the tree $T_i$ with $c_{1,1}$, otherwise color them with $c_{1,0}$. Similarly, 
color the children of the root of $T_j$ with $c_{2,1}$ if $l_j$ is positive and $c_{2,0}$ if it's 
negative and the children of the root of $T_k$ with $c_{3,1}$ or $c_{3,0}$. Notice that we 
know that all trees have height at least 1 because $h \geq 2$ and variables are 
numbered n, . . . , 2n − 1, so all roots do have children. Finally, merge the 
roots of $T_i, T_j, T_k$ into a single vertex. We introduce a new color, call it c, 
and use it to color the new root of the tree that represents each clause. To 
complete the construction, add a new root vertex to the graph and make 
all roots of previously constructed trees its children. This creates a tree 
with height h. The root has one v-colored child for each variable and one 
c-colored child for each clause of the 3SAT formula. 

Now, let us describe a formula with O(h) quantifiers that will check if 
the original 3SAT instance was satisfiable. Informally, we will ask if there 
exists a set of variables, represented by a subset of the vertices colored 
with v, such that setting these to true and the rest to false satisfies the 
formula. To do this, we need to be able to check if a variable appears 
positive or negative in a clause. Let's define a predicate $\mathit{pos}_i(x, y)$ which 
will be true if variable x appears positive in position i (where $i \in [3]$) in 
the clause represented by y (so x is assumed to be the root of a variable 
tree and y is assumed to be the root of a clause tree). 

$$\begin{aligned}
\mathit{pos}_i(x, y) := {} & \forall u\, (C(u, x) \lor (C(u, y) \land P_{c_{i,1}}(u))) \to {} \\
& \exists v\, (\mathit{eq}_h(u, v) \land (C(u, x) \to (C(v, y) \land P_{c_{i,1}}(v))) \land (C(u, y) \to C(v, x)))
\end{aligned}$$

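The logic here is exactly the same as in the eq predicate, except that 
we only take into account the children of the clause node that correspond 
to the i-th literal. It's easy to see how to make a similar predicate $\mathit{neg}_i$ 
for negative appearances (change $P_{c_{i,1}}$ to $P_{c_{i,0}}$). Given these, the complete 
formula is: 

$$\begin{aligned}
\mathit{isSAT} := {} & \exists S\, (\forall x\, (x \in S \to P_v(x))) \land {} \\
& \forall y\, (P_c(y) \to \exists z\, (P_v(z) \land {} \\
& \quad ((\bigvee_{i \in [3]} (\mathit{pos}_i(z, y) \land z \in S)) \lor (\bigvee_{j \in [3]} (\mathit{neg}_j(z, y) \land z \notin S)))))
\end{aligned}$$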

In words, there exists a set of variables S (which will be set to true), 
such that for each clause there exists a variable appearing in it that sat- 
isfies it, that is, it belongs in S if and only if its appearance is positive. 

The formula has O(h) quantifiers, we have used $\log^{(h)} n + O(1)$ colors 
and have constructed a tree of height h and size polynomial in n. If 
there exists a model-checking algorithm for t-colored trees running in 
$\exp^{(h+1)}(o(t)) |V|^c$ this gives a $2^{o(n)}$ algorithm for 3-SAT. 

The only thing left is the case h = 1. Here each variable and each 
clause will be represented by a single vertex and, since we have O(log n) 
colors available, the colors alone will be sufficient to compare indices. It's 
not hard to see how to encode the whole structure of the formula using 
7 log n colors. The first log n colors are used for the variable vertices. 
Then we need 6 sets of log n distinct colors to encode the appearances of 
literals into the clauses, for each combination of position and positivity. 
This makes it straightforward to implement posi and negi by comparing 
appropriate sets of colors on the two vertices. The rest of the formula is 
unchanged. □ 